Attribute Access and Descriptors
================================

We've covered the syntax for accessing attributes, and we've covered how to
create new classes in Python. We talked about one special method --
``__init__``. Now we're going to cover several special methods that all have
to do with attribute access.

The Four Things You Do With Attributes
---------------------------------------

It's important to remember the 4 things you can do with attributes. These are
the same as the things you can do with any namespace, as attributes form a
namespace for objects.

1. Declare a new attribute (and give it an initial value).

2. Access the attribute, retrieving its value.

3. Assign to the attribute, giving it a new value.

4. Delete the attribute, removing it from the namespace.

As with variables, you declare an attribute by assigning to it for the first
time, so we will treat these two cases as just "assignment".

When we go about messing with how Python handles attributes, we need to think
about what it is we are trying to accomplish and why. Here are some common
scenarios I encounter.

Default Attributes
""""""""""""""""""

Sometimes I want certain default attributes to exist whether or not I assign
to them in ``__init__``. While I can use class attributes to handle this, I
can also customize the attribute access pattern.

Calculated Attributes
"""""""""""""""""""""

Some attributes may be calculated from the object. For instance, in the case
of complex numbers, the magnitude of the number is calculated as the square
root of the sum of the squares of the real and imaginary parts. I don't want
to calculate and store this for every complex number, but I do want it to
appear as an attribute. I can modify how the attribute is accessed so that,
instead of looking in the ``__dict__``, Python calculates it on the fly.

Note that I don't recommend this sort of behavior. It is surprising to see
that an attribute access has caused a function to be called. It's weird to get
exceptions from attribute accesses.
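
For readers who want to see what a calculated attribute looks like, here is a
minimal sketch. It assumes a made-up ``Complex`` class (not Python's built-in
``complex``) and uses ``property``, which is covered later, as one possible way
to compute the value on access.

```python
import math


class Complex:
    """Hypothetical complex-number class, for illustration only."""

    def __init__(self, real, imag):
        self.real = real
        self.imag = imag

    @property
    def magnitude(self):
        # Computed on every access; nothing is stored in the instance __dict__.
        return math.hypot(self.real, self.imag)


z = Complex(3, 4)
print(z.magnitude)               # 5.0
print('magnitude' in vars(z))    # False
```

The caller writes ``z.magnitude`` with no parentheses, which is exactly the
kind of hidden function call the caveat above is warning about.
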

Cached Attribute Values
"""""""""""""""""""""""

Sometimes we want to cache values after we have accessed them, and we can
store them as an attribute. I'll cover how to do this in different ways, and
give my recommendations.

Indefinite Attributes
"""""""""""""""""""""

Sometimes you don't know what attributes your object will have, or there is a
virtually unlimited number of possibilities. In this case, you simply can't
assign all the possible attributes, so you need to calculate them instead.

Remembering Attribute Manipulation
""""""""""""""""""""""""""""""""""

Another special case arises in things like SQLAlchemy. SQLAlchemy allows you
to create objects that represent rows in database tables. Attributes are the
columns of that row. Assigning to an object's attribute signals your intention
to update a column in that row. However, SQLAlchemy is written such that you
can accumulate all the changes you'd like to make in a transaction before
flushing them out to the database, so it has to remember which attributes you
have touched.

Limiting Attribute Values
"""""""""""""""""""""""""

We might want to limit what values we can assign to certain attributes.
Usually we're only interested in values of a specific type, but sometimes we
also want to enforce limits on the values themselves. For instance, it makes
no sense to have circles with negative radii, so we might want to limit the
values of ``radius`` to numbers greater than or equal to 0.

``__getattribute__(self, name)``
--------------------------------

The "granddaddy" of all the attribute access special methods is
``__getattribute__``. The ``object`` base class of all base classes for all
time ever defines its own ``__getattribute__`` that implements all of the
magical powers I describe in this video.

I have never written this method for any class that I have ever defined in all
my years of Python programming. So I'll tell you what it does and the one or
two cases where you might be tempted to use it, and why you should use
something else instead.

This method is called for *every* attribute access. If the attribute is in the
namespace, or a descriptor, or handled by the attribute access methods, this
is still called. (Note, by *every* I really mean *everything you can think of
right now.* Implicit lookups of special methods, such as ``__len__`` when you
call ``len(obj)``, do not go through this method, for everyone's sanity.)

If you were to write this method and still wanted the attribute access
methods, descriptors, and so on to work, you would need to either write that
code yourself or delegate to ``object.__getattribute__(self, name)``. Raising
an ``AttributeError`` does not bring the default behavior back: Python then
calls ``__getattr__`` if the class defines one, and otherwise the error
propagates to the caller.

Why would you do this? Well, you might want to have very special rules for
when to create a new attribute, what values to assign to attributes, how to
calculate an attribute on the fly, or what to do when you delete an attribute.
However, each of these cases is handled by the methods below exactly the way
you imagine it should be.

The only case where I think this might be useful is if you want some sort of
super-descriptor scenario, but even then, I'd encourage you to find a way to
make descriptors work. When you include metaclasses in your arsenal, you'll
find that mucking with this is truly unnecessary.

``__dir__(self)``
-----------------

This returns the list of attributes when the object is passed as an argument
to ``dir()``. I never override this, and I don't think you should either.

``__getattr__(self, name)``
---------------------------

This returns a value given a name, but only if normal lookup fails: the name
is not in the instance's ``__dict__`` and is not found on the class either.

The neat thing about this is that you don't even need the name passed in to be
stored in the ``__dict__`` of the object. You can just make things up as
needed.

Here's an example:

.. code:: python

    class Attributor:
        def __getattr__(self, name):
            return name

    a = Attributor()
    a.foo  # -> 'foo'
    a.foo = 6
    a.foo  # -> 6, because 'foo' is in self.__dict__

It's been a long time since I've found a good use for this.
Descriptors just do this better. The only case where this might be beneficial
is in the indefinite number of attributes scenario.

``__setattr__(self, name, value)``
----------------------------------

This special method will be called when you try to assign to any attribute.
And by "any", I really mean "any".

If you want to proceed with the normal assignment behavior of attributes, you
*must* call ``super().__setattr__(name, value)`` or just
``object.__setattr__(self, name, value)``. You may also store the value
directly in the object's ``__dict__``.

In this example, I have a ``Vector3`` class that allows you to set the x, y,
and z components by attribute assignment.

.. code:: python

    class Vector3:
        def __init__(self):
            # This assignment also goes through __setattr__ (the else branch).
            self.vector = [0, 0, 0]

        def __setattr__(self, name, value):
            if name == 'x':
                self.vector[0] = value
            elif name == 'y':
                self.vector[1] = value
            elif name == 'z':
                self.vector[2] = value
            else:
                object.__setattr__(self, name, value)

Like ``__getattr__``, it's been a long time since I've seen a use for this.
Descriptors do it better. The only case where this might be beneficial is in
the indefinite number of attributes scenario.

Remember that *every* attribute assignment will go through this function, so
plan accordingly!

``__delattr__(self, name)``
---------------------------

This is parallel to ``__setattr__``, except it handles attribute deletion.

I don't think I've ever written this in my entire career. It may be useful if
deleting an attribute is meaningful. I must not have a good imagination,
because I can't think of a situation where it would be.

Attribute Dictionary
--------------------

Given the three methods above, you might get the idea that you can write an
Attribute Dictionary, a class that behaves like a dictionary but also allows
you to access the values via the attribute syntax. This has been done multiple
times.
I don't encourage it, however, for a number of reasons, not the least of which
is that you can have keys in the dictionary that are not valid attribute
names, and you might have collisions between actual attributes and dictionary
elements.

Descriptors
-----------

At this point, I get to introduce one of the truly unique and scary ideas in
Python: Descriptors.

The confusing part about descriptors is understanding exactly *when* the
special methods get called. In short, if the descriptor is an attribute of an
instance (not the *class*, the *instance*), then it is not treated as a
descriptor, and the attribute access does nothing special. But there is one
important exception: classes which have attributes that are descriptors, when
accessed directly *through* the class, have the descriptor's ``__get__``
invoked. We call classes with attributes that are descriptors *owner classes*.

Let me try to simplify this all. Suppose we have the following:

- A descriptor called ``desc`` that defines the special methods ``__get__``,
  ``__set__``, or ``__delete__``.

- A class called ``Owner`` that has an attribute called ``desc`` that is a
  descriptor.

- An instance of class ``Owner`` called ``inst`` that does not have an
  attribute called ``desc`` assigned at the instance level. (Meaning, you
  never ran ``inst.__dict__['desc'] = ...``. If you tried ``inst.desc = ...``,
  that would invoke ``__set__``.)

These are the only four cases where the descriptor methods get called:

- If you call the method directly: ``desc.__get__(...)``,
  ``desc.__set__(...)`` or ``desc.__delete__(...)``.

- When you access, assign, or delete ``desc`` through ``inst``: ``inst.desc``,
  ``inst.desc = x``, ``del inst.desc``.

- When you access ``desc`` through the class ``Owner``: ``Owner.desc``.
  Deleting or assigning ``desc`` through the class ``Owner`` does not invoke
  the special methods.
- Something to do with ``super`` that we'll cover in inheritance. It's really
  not that important, because it hardly ever becomes an issue.

If you're thoroughly confused, don't feel bad. This *is* confusing. The way I
remember it is as follows:

- Descriptors that are just variables are *not* special.

- Instances with descriptor attributes defined at that level are *not*
  special. It is just like descriptors that are variables.

- Classes with descriptor attributes *are* special, both for the class and
  instances of that class.

Example
"""""""

Here's some code that will help clarify things.

.. code:: python

    class Desc:
        def __get__(self, instance, owner):
            print(f"__get__({self}, {instance}, {owner})")

        def __set__(self, instance, value):
            print(f"__set__({self}, {instance}, {value})")

        def __delete__(self, instance):
            print(f"__delete__({self}, {instance})")

    desc = Desc()  # desc is a descriptor

    # desc behaves like a normal variable. There is no special behavior.
    desc

    # Accessing the methods directly through desc, the variable.
    desc.__get__(1, 2)
    # __get__(<__main__.Desc object at 0x000001F0C17B8D68>, 1, 2)
    desc.__set__(1, 2)
    # __set__(<__main__.Desc object at 0x000001F0C17B8D68>, 1, 2)
    desc.__delete__(1)
    # __delete__(<__main__.Desc object at 0x000001F0C17B8D68>, 1)

    # Creating an owner class
    class Owner:
        pass

    # The descriptor does not need to be assigned in the class body,
    # but it can be.
    Owner.desc = desc

    # Accessing the descriptor as an attribute of the class will invoke
    # __get__. Note the arguments!
    Owner.desc
    # __get__(<__main__.Desc object at 0x000001F0C17B8D68>, None, <class '__main__.Owner'>)

    # Deleting the attribute will just delete the descriptor.
    # No special method.
    del Owner.desc
    Owner.desc
    # Traceback (most recent call last):
    #   ...
    # AttributeError: type object 'Owner' has no attribute 'desc'

    # Restoring the descriptor
    Owner.desc = desc
    Owner.desc
    # __get__(<__main__.Desc object at 0x000001F0C17B8D68>, None, <class '__main__.Owner'>)

    # Assigning will overwrite it, no special method.
    Owner.desc = 2
    Owner.desc
    # 2

    # Restoring the descriptor
    Owner.desc = desc

    # inst is an instance of Owner.
    inst = Owner()

    # Accessing through inst invokes the __get__ method. Note the arguments!
    inst.desc
    # __get__(<__main__.Desc object at 0x000001F0C17B8D68>, <__main__.Owner object at 0x000001F0C1831C88>, <class '__main__.Owner'>)

    # Assigning through inst invokes the __set__ method. Note the arguments!
    inst.desc = 5
    # __set__(<__main__.Desc object at 0x000001F0C17B8D68>, <__main__.Owner object at 0x000001F0C1831C88>, 5)

    # Deleting through inst invokes the __delete__ method. Note the arguments!
    del inst.desc
    # __delete__(<__main__.Desc object at 0x000001F0C17B8D68>, <__main__.Owner object at 0x000001F0C1831C88>)

    # Remove the descriptor from Owner.
    del Owner.desc

    # There is no desc attribute on inst anymore.
    inst.desc
    # Traceback (most recent call last):
    #   ...
    # AttributeError: 'Owner' object has no attribute 'desc'

    # Let's assign the desc to the instance.
    inst.desc = desc

    # Accessing does nothing special.
    inst.desc
    # <__main__.Desc object at 0x000001F0C17B8D68>

    # Assignment does nothing special.
    inst.desc = 2
    inst.desc
    # 2

    # Resetting
    inst.desc = desc

    # Deleting does nothing special.
    del inst.desc
    inst.desc
    # Traceback (most recent call last):
    #   ...
    # AttributeError: 'Owner' object has no attribute 'desc'

Data vs. Non-data Descriptors
"""""""""""""""""""""""""""""

You may hear people talk about "data" or "non-data" descriptors. This is
rather easy to explain:

- Data descriptors have either or both of ``__set__`` and ``__delete__``
  defined.

- Non-data descriptors don't have either defined.

Note that you *can* have a descriptor that doesn't have ``__get__`` defined,
but I have never seen a use for this. The net effect of such a descriptor is
that only assignment and deletion are customized. If you access it through an
instance, you get whatever is stored under that name in the instance's
``__dict__``, or the descriptor object itself if nothing is stored there.
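
One practical consequence of the data / non-data distinction is lookup
precedence: a value in the instance's ``__dict__`` shadows a non-data
descriptor, but a data descriptor wins over the instance's ``__dict__``. The
sketch below demonstrates this; the names ``NonData``, ``Data``, and ``Holder``
are made up for the example.

```python
class NonData:
    """Hypothetical non-data descriptor: defines only __get__."""

    def __get__(self, instance, owner=None):
        return 'from NonData.__get__'


class Data:
    """Hypothetical data descriptor: defines __get__ and __set__."""

    def __get__(self, instance, owner=None):
        return 'from Data.__get__'

    def __set__(self, instance, value):
        raise AttributeError('read-only')


class Holder:
    nd = NonData()
    d = Data()


h = Holder()
# Write straight into the instance namespace, bypassing any __set__.
h.__dict__['nd'] = 'instance value'
h.__dict__['d'] = 'instance value'

print(h.nd)  # instance value        (instance __dict__ shadows NonData)
print(h.d)   # from Data.__get__     (Data takes priority over __dict__)
```
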

``__get__(self, instance, owner=None)``
"""""""""""""""""""""""""""""""""""""""

This special method is called whenever the descriptor is accessed in one of
the four ways listed above. Note that ``instance`` may be ``None``.

- If the descriptor was accessed directly through the class, i.e.
  ``desc_class.desc``, then ``instance`` is ``None`` and ``owner`` is
  ``desc_class``.

- If the descriptor was accessed through an instance of the class, i.e.
  ``instance.desc``, then ``instance`` is that instance and ``owner`` is
  ``desc_class``.

You should return the value that should be the value of the attribute for this
descriptor. Note that you are *not* given the name that was used to find the
attribute, so unless you stored it previously, you won't have that available.
(We'll talk about ``__set_name__`` later.)

This is actually a pretty big problem to solve and it causes a bit of a
headache. See, each instance of a class with a descriptor for an attribute is
using the *same* descriptor for attribute access. This means you need to know
in this code *which* attributes to look at in the instance to calculate the
value for this lookup. However, with a bit of imagination, you can come up
with good solutions. And Python 3.6 gave us ``__set_name__``, which will help.

This function should either return a value (remember that no return statement
means ``return None``) or raise ``AttributeError`` if the attribute shouldn't
exist.

Typically, especially in the case of caching attribute values, you'll want to
store the value you calculated in the instance of the class so that you don't
have to calculate it again. This means that I typically see the following
pattern for this method:

.. code:: python

    class MyDescriptor:
        def __get__(self, instance, owner=None):
            if instance is None:
                # We're being accessed through the class
                return self

            # We're being accessed through an instance
            try:
                return instance._cached[self.name]
            except KeyError:
                pass

            value = ...
            instance._cached[self.name] = value
            return value

In order to use the descriptor, we need to do something like this:

.. code:: python

    class MyClass:
        foo = MyDescriptor()
        foo.name = 'foo'

Using the descriptor looks like this:

.. code:: python

    MyClass.foo  # -> __get__(self, None, MyClass)

    a = MyClass()
    a.foo  # -> __get__(self, a, MyClass)

When should we use descriptors? Pretty much anytime we want to override the
default behavior of attribute access. There may already be a built-in
descriptor that does what you want (we'll look at ``property``,
``classmethod`` and ``staticmethod`` in this video), so I'd typically use one
of those, especially ``property``. Rarely do I ever write an entirely new
descriptor.

``__set__(self, instance, value)``
""""""""""""""""""""""""""""""""""

This is called when you try to assign to a descriptor in one of the ways I
mentioned earlier. Again, we run into the same problem we have with
``__get__`` and names of attributes. Unless you've recorded the name of the
attribute, you won't know what name the attribute was accessed through.

Typically, we use ``__set__`` to modify the value before storing it,
especially if we want to make sure that the value is of an acceptable type or
within an acceptable range. However, we might also want to store the value
somewhere other than in the instance's namespace under the same name.

Here's a typical pattern I might see for a ``__set__`` method:

.. code:: python

    class Desc:
        def __set__(self, instance, value):
            instance.__dict__['_' + self.name] = int(value)

And the way it gets invoked:

.. code:: python

    class Owner:
        foo = Desc()
        foo.name = 'foo'

    instance = Owner()
    instance.foo = 7.0  # instance.__dict__['_foo'] <- 7

    # Assigning through the class does NOT invoke __set__.
    # Owner.foo = "5" would simply replace the descriptor with "5".

As with ``__get__``, I don't tend to write my own descriptors. ``property``
actually does everything I need.
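
To tie ``__get__`` and ``__set__`` together, here is a minimal sketch of a
descriptor for the "limiting attribute values" scenario. The names
``NonNegative`` and ``Circle`` are made up for this example, and the attribute
name is passed in by hand, which is just one way to solve the naming problem
described above.

```python
class NonNegative:
    """Hypothetical data descriptor that rejects negative numbers."""

    def __init__(self, name):
        # The attribute name is supplied explicitly; the value is stored
        # under '_' + name in the instance namespace.
        self.storage = '_' + name

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed through the class: return the descriptor itself.
            return self
        return instance.__dict__[self.storage]

    def __set__(self, instance, value):
        if value < 0:
            raise ValueError('value must be >= 0')
        instance.__dict__[self.storage] = value


class Circle:
    radius = NonNegative('radius')

    def __init__(self, radius):
        self.radius = radius  # goes through NonNegative.__set__


c = Circle(2)
print(c.radius)  # 2
try:
    c.radius = -1
except ValueError as exc:
    print(exc)   # value must be >= 0
print(c.radius)  # 2
```

Because every ``Circle`` shares the single ``NonNegative`` object, the value
has to live on the instance rather than on the descriptor.
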

``__delete__(self, instance)``
""""""""""""""""""""""""""""""

This method is invoked when the attribute is deleted in one of the ways
mentioned above. Don't confuse it with ``__del__``, which is invoked when the
object is about to be destroyed by the garbage collector.

I don't have a whole lot to say about this, other than you probably want to
use ``property`` instead.

``__set_name__(self, owner, name)``
"""""""""""""""""""""""""""""""""""

This isn't really a descriptor method, but it goes along closely with the
others. When Python creates the class (in ``type(name, bases, dict)``), it
searches the ``dict``, the namespace of the class, looking for values with
this method. If it finds one, it calls
``value.__set_name__(new_class, name)``, where ``name`` is the name of the
value in that namespace.

It's quite convenient, especially when you think about how hard it is to get
the name of the attribute. If we didn't have this, we would have to set the
name explicitly, as I did in the examples above, or pass it as a parameter
when creating the descriptor, or whatnot. This greatly simplifies that
process.

Built-In Descriptors
--------------------

There are three built-in descriptors that I will mention here. All three can
be used as decorators, and each serves a very specific purpose.

``@staticmethod``
"""""""""""""""""

Sometimes you want a method on a class that doesn't rely at all on the class
attributes or instance attributes. In this case, ``staticmethod`` provides a
convenient decorator:

.. code:: python

    class MyClass:
        @staticmethod
        def sum(a, b):
            return a + b

    MyClass.sum(1, 2)  # 3

    a = MyClass()
    a.sum(1, 5)  # 6

``@classmethod``
""""""""""""""""

Parallel to ``@staticmethod`` is this decorator, which ensures that the first
parameter is always the class, even when it is invoked from an instance of the
class.

.. code:: python

    class MyClass:
        @classmethod
        def what_class_am_i(cls):
            return cls

    MyClass.what_class_am_i()  # MyClass

    a = MyClass()
    a.what_class_am_i()  # MyClass

This is convenient because when you call an ordinary method through the class
without it, you have to pass in *something* as the first parameter.

I use this especially when I am creating a singleton-style class. Why not just
use the class as the singleton? You'll see things like this a lot with
libraries like Flask and CherryPy.

``property``
""""""""""""

This descriptor is arguably one of the most useful descriptors ever invented.
Indeed, it can be said that descriptors were invented specifically to make
``property`` possible. 99.9% of the time, when you want to modify how
attributes are accessed, assigned, or deleted, ``property`` has you covered.

Let's look at how it might be used:

.. code:: python

    class MyClass:
        @property
        def a(self):
            return self.b * 6

        @a.setter
        def a(self, value):
            self.b = value / 6

        @a.deleter
        def a(self):
            del self.b

    MyClass.a  # the property object itself

    i = MyClass()
    i.a  # AttributeError -- no b!
    i.a = 1
    i.b  # 0.16666...
    i.a  # 1.0
    del i.a
    i.b  # AttributeError
    i.a  # AttributeError -- no b!

Note that assigning to or deleting ``MyClass.a`` will obliterate the
descriptor, removing it completely.

Keep in mind that creating setters or deleters for the property is entirely
optional.

It is really fun to figure out how ``property`` is implemented. Keep in mind
the following:

- After the setter and deleter decorators are applied, Python will assign the
  results to the ``a`` variable in the class namespace. How does ``property``
  not overwrite itself?

- How does the ``property`` know to do the right thing for
  ``MyClass.a = ...`` and ``del MyClass.a``?

``__slots__``
-------------

Before I let you go, I want to mention ``__slots__``. In the class suite, if
you define a variable called ``__slots__``, then Python won't create a
``__dict__`` for instances of the class. Instead, it reserves a spot in each
instance for each of the attribute names listed in ``__slots__``.
If you try to assign an attribute that is not listed, an ``AttributeError``
will be raised.

``__slots__`` creates a special descriptor on the class for each listed name.
That means you can't use class attributes as defaults for those names: the
class attribute would overwrite the slot descriptor, so Python raises a
``ValueError`` when the class is created.

There are some more caveats and warnings, and I encourage you to read the
Python documentation on it if you think this might be useful.

Speaking of which, when is this useful? If you're creating tons and tons of
instances and you are worried about the overhead of each object having its own
``__dict__``, this is a way you can reduce the creation time and memory
overhead of new instances of the class. That's about all it is useful for.
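
Here is a small sketch of the behavior described above, using a made-up
``Point`` class. The descriptor type name in the comment is what CPython
reports; other implementations may differ.

```python
class Point:
    """Hypothetical class used only to illustrate __slots__."""

    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
print(hasattr(p, '__dict__'))    # False: no per-instance dictionary
print(type(Point.x).__name__)    # member_descriptor (on CPython)

try:
    p.z = 3                      # 'z' is not listed in __slots__
except AttributeError:
    print('cannot add z')
```
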